Automatic Generation of SIMD DSP Code

Authors

  • Franz Franchetti
  • Markus Püschel
Abstract

Short vector SIMD instructions on recent microprocessors, such as SSE on Pentium III and 4, speed up code but are a major challenge to software developers. This report introduces a compiler that automatically generates C code enhanced with short vector instructions for digital signal processing (DSP) transforms, such as the fast Fourier transform (FFT). The input to the compiler is a concise mathematical description of a DSP algorithm in the language SPL. SPL is used in the Spiral system (http://www.ece.cmu.edu/~spiral) to generate highly optimized, architecture-adapted implementations of DSP transforms. Interfacing the newly developed compiler with Spiral yields speed-ups of up to a factor of 2 in several important cases, including the FFT and the discrete cosine transform (DCT), which is used, for instance, in the JPEG compression standard. For the FFT, the automatically generated code is competitive with the hand-coded Intel Math Kernel Library (MKL).
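To make "C code enhanced with short vector instructions" concrete, the following is a minimal hand-written sketch, not output of the compiler described above. It assumes the simplest FFT building block, a radix-2 butterfly (sum and difference) applied to arrays of real floats, and shows a scalar loop next to a 4-way SSE version. The function names `butterfly_scalar` and `butterfly_vec4` are hypothetical; the intrinsics come from the standard `<xmmintrin.h>` header, and the code falls back to the scalar loop when SSE is not available.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_add_ps, _mm_sub_ps */
#endif

/* Scalar reference: one butterfly (sum and difference) per iteration. */
void butterfly_scalar(const float *a, const float *b,
                      float *sum, float *diff, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        sum[i]  = a[i] + b[i];
        diff[i] = a[i] - b[i];
    }
}

/* Illustrative vectorized variant (hypothetical name): four butterflies
 * per iteration using 128-bit SSE registers. n must be a multiple of 4. */
void butterfly_vec4(const float *a, const float *b,
                    float *sum, float *diff, size_t n)
{
    assert(n % 4 == 0);
#if defined(__SSE__)
    for (size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(sum + i,  _mm_add_ps(va, vb));
        _mm_storeu_ps(diff + i, _mm_sub_ps(va, vb));
    }
#else
    /* Assumption: without SSE support, fall back to the scalar loop. */
    butterfly_scalar(a, b, sum, diff, n);
#endif
}
```

Calling both functions on the same inputs should give identical results, since each output element is a single addition or subtraction. Real FFT code also needs complex arithmetic, twiddle factors, and data reordering, which is where hand vectorization becomes difficult and automatic generation pays off.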

Related papers

A HW/SW design methodology for embedded SIMD vector signal processors

SIMD processors have made their way from supercomputer architectures to embedded real-time signal processing. This trend has been driven by signal processing applications with heavy number-crunching requirements, for example base-band processing on mobile devices. Depending on the data dependencies of algorithms and implementation constraints like real-time, power consumption and die ...

Short Vector SIMD Code Generation for DSP Algorithms

Short vector SIMD instructions on recent general purpose microprocessors, such as SSE on Pentium III and 4, offer a high potential speed-up but require a very high level of programming expertise. We present a compiler that generates vectorized code for digital signal processing algorithms such as the fast Fourier transform (FFT). The input to our compiler is a mathematical description of the al...

Performance Evaluation of Parallel Simd

A simulator for SIMD type architectures is presented. Starting from an architecture independent algorithm description based on recurrence equations, transformation steps for automatic parallelization, mapping and code generation are outlined. The final pseudo code program together with architecture dependent parameters and execution time tables, are fed into the simulator in order to gain perform...

Short vector code generation and adaptation for DSP algorithms

Most recent general purpose processors feature short vector SIMD instructions, like SSE on Pentium III/4. In this paper we automatically generate platform-adapted short vector code for DSP transform algorithms using SPIRAL. SPIRAL represents and generates fast algorithms as mathematical formulas, and translates them into code. Adaptation is achieved by searching in the space of algorithmic and ...

Automatic SIMD Parallelization of Embedded Applications Based on Pattern Recognition

This paper investigates the potential for automatic mapping of typical embedded applications to architectures with multimedia instruction set extensions. For this purpose a (pattern matching based) code transformation engine is used, which involves a three-step process of matching, condition checking and replacing of the source code. Experiments with DSP and the MPEG2 encoder benchmarks, show t...

Publication date: 2001